    When Causal Intervention Meets Adversarial Examples and Image Masking for Deep Neural Networks

    Discovering and exploiting the causality in deep neural networks (DNNs) are crucial challenges for understanding and reasoning about causal effects (CE) on an explainable visual model. "Intervention" has been widely used for recognizing a causal relation ontologically. In this paper, we propose a causal inference framework for visual reasoning via do-calculus. To study the intervention effects on pixel-level features for causal reasoning, we introduce pixel-wise masking and adversarial perturbation. In our framework, CE is calculated using features in a latent space and perturbed prediction from a DNN-based model. We further provide the first look into the characteristics of discovered CE of adversarially perturbed images generated by gradient-based methods (https://github.com/jjaacckkyy63/Causal-Intervention-AE-wAdvImg). Experimental results show that CE is a competitive and robust index for understanding DNNs when compared with conventional methods such as class-activation mappings (CAMs) on the Chest X-Ray-14 dataset for human-interpretable feature(s) (e.g., symptom) reasoning. Moreover, CE holds promise for detecting adversarial examples, as it possesses distinct characteristics in the presence of adversarial perturbations. Comment: Note that our camera-ready version has changed the title; "When Causal Intervention Meets Adversarial Examples and Image Masking for Deep Neural Networks" is the official v3 paper title in the IEEE proceedings. Please use it in formal references. Accepted at IEEE ICIP 2019. PyTorch code has been released at https://github.com/jjaacckkyy63/Causal-Intervention-AE-wAdvIm
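
The pixel-wise masking intervention mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's CE computation, which uses latent-space features and do-calculus; it only shows the simpler idea of setting one pixel to a fixed value and recording how the model output changes. `toy_model`, its weights, and `masking_effects` are hypothetical names introduced here for illustration.

```python
def toy_model(image):
    """Hypothetical stand-in for a DNN: a fixed weighted sum of pixels."""
    weights = [
        [0.0, 0.1, 0.0],
        [0.1, 0.6, 0.1],
        [0.0, 0.1, 0.0],
    ]
    return sum(
        w * p
        for w_row, p_row in zip(weights, image)
        for w, p in zip(w_row, p_row)
    )


def masking_effects(model, image, fill=0.0):
    """For each pixel, mask it (set it to `fill`) and record the drop in output."""
    base = model(image)
    effects = []
    for i, row in enumerate(image):
        effect_row = []
        for j in range(len(row)):
            masked = [r[:] for r in image]  # copy so the input is untouched
            masked[i][j] = fill             # the intervention on one pixel
            effect_row.append(base - model(masked))
        effects.append(effect_row)
    return effects


image = [[1.0, 1.0, 1.0] for _ in range(3)]
effects = masking_effects(toy_model, image)
for effect_row in effects:
    print(["%.2f" % e for e in effect_row])
```

In this toy setting the centre pixel carries the largest weight, so masking it produces the largest change; a real model would be queried in the same way, one forward pass per masked input.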

    Treatment Learning Causal Transformer for Noisy Image Classification

    Current top-notch deep learning (DL) based vision models are primarily based on exploring and exploiting the inherent correlations between training data samples and their associated labels. However, a known practical challenge is their degraded performance against "noisy" data, induced by different circumstances such as spurious correlations, irrelevant contexts, domain shift, and adversarial attacks. In this work, we incorporate this binary information of "existence of noise" as treatment into image classification tasks to improve prediction accuracy by jointly estimating their treatment effects. Motivated by causal variational inference, we propose a transformer-based architecture, Treatment Learning Causal Transformer (TLT), that uses a latent generative model to estimate robust feature representations from current observational input for noisy image classification. Depending on the estimated noise level (modeled as a binary treatment factor), TLT assigns the corresponding inference network trained by the designed causal loss for prediction. We also create new noisy image datasets incorporating a wide range of noise factors (e.g., object masking, style transfer, and adversarial perturbation) for performance benchmarking. The superior performance of TLT in noisy image classification is further validated by several refutation evaluation metrics. As a by-product, TLT also improves visual salience methods for perceiving noisy images. Comment: Accepted to IEEE WACV 2023. The first version was finished in May 201

    Interpretable Self-Attention Temporal Reasoning for Driving Behavior Understanding

    Performing driving behaviors based on causal reasoning is essential to ensure driving safety. In this work, we investigated how state-of-the-art 3D Convolutional Neural Networks (CNNs) perform on classifying driving behaviors based on causal reasoning. We proposed a perturbation-based visual explanation method to inspect the models' performance visually. By examining the video attention saliency, we found that existing models could not precisely capture the causes (e.g., traffic light) of the specific action (e.g., stopping). Therefore, the Temporal Reasoning Block (TRB) was proposed and introduced to the models. With the TRB models, we achieved an accuracy of 86.3%, which outperforms the state-of-the-art 3D CNNs from previous works. The attention saliency also demonstrated that TRB helped models focus on the causes more precisely. With both numerical and visual evaluations, we concluded that our proposed TRB models were able to provide accurate driving behavior prediction by learning the causal reasoning of the behaviors. Comment: Submitted to IEEE ICASSP 2020; PyTorch code will be released soon

    Adversarial Reweighting for Speaker Verification Fairness

    We address performance fairness for speaker verification using the adversarial reweighting (ARW) method. ARW is reformulated for speaker verification with metric learning, and shown to improve results across different subgroups of gender and nationality, without requiring annotation of subgroups in the training data. An adversarial network learns a weight for each training sample in the batch so that the main learner is forced to focus on poorly performing instances. Using a min-max optimization algorithm, this method improves overall speaker verification fairness. We present three different ARW formulations: accumulated pairwise similarity, pseudo-labeling, and pairwise weighting, and measure their performance in terms of equal error rate (EER) on the VoxCeleb corpus. Results show that the pairwise weighting method can achieve 1.08% overall EER, 1.25% for male and 0.67% for female speakers, with relative EER reductions of 7.7%, 10.1% and 3.0%, respectively. For nationality subgroups, the proposed algorithm showed 1.04% EER for US speakers, 0.76% for UK speakers, and 1.22% for all others. The absolute EER gap between gender groups was reduced from 0.70% to 0.58%, while the standard deviation over nationality groups decreased from 0.21 to 0.19.
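
The results in this abstract are reported as equal error rate (EER), the operating point where the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to estimate it from similarity scores by sweeping candidate thresholds. It assumes higher scores mean "same speaker"; the function name `equal_error_rate` and the score lists are made up for illustration and are unrelated to the VoxCeleb results above.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER by sweeping thresholds over all observed scores.

    A trial is accepted when its score is >= the threshold.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false accepts
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejects
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Illustrative scores only (not data from the paper).
genuine = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor = [0.7, 0.5, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.1%} at threshold {threshold}")
```

Per-subgroup EERs such as those quoted for gender or nationality would be obtained by running the same computation on the trials of each subgroup separately.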

    Co-delivery of salinomycin and curcumin for cancer stem cell treatment by inhibition of cell proliferation, cell cycle arrest, and epithelial-mesenchymal transition

    Malignant cancer is a devastating disease often associated with a poor clinical prognosis. For decades, modern drug discoveries have attempted to identify potential modulators that can impede tumor growth. Cancer stem cells, however, are more resistant to therapeutic intervention, which often leads to treatment failure and subsequent disease recurrence. In this study, we have developed a specific multi-target drug delivery nanoparticle system against breast cancer stem cells (BCSCs). Therapeutic agents curcumin and salinomycin have complementary functions of limiting therapeutic resistance and eliciting cellular death, respectively. By conjugation of CD44 cell-surface glycoprotein with poly(lactic-co-glycolic acid) (PLGA) nanoparticles that are loaded with curcumin and salinomycin, we investigated the cellular uptake of BCSCs, drug release, and therapeutic efficacy against BCSCs. We determined CD44-targeting co-delivery nanoparticles are highly efficacious against BCSCs by inducing G1 cell cycle arrest and limiting epithelial–mesenchymal transition. This curcumin and salinomycin co-delivery system can be an efficient treatment approach to target malignant cancer without the repercussion of disease recurrence.

    Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting

    We explore the ability of large language models (LLMs) to act as speech recognition post-processors that perform rescoring and error correction. Our first focus is on instruction prompting to let LLMs perform these tasks without fine-tuning, for which we evaluate different prompting schemes, both zero- and few-shot in-context learning, and a novel task activation prompting method that combines causal instructions and demonstration to increase its context windows. Next, we show that rescoring only by in-context learning with frozen LLMs achieves results that are competitive with rescoring by domain-tuned LMs, using a pretrained first-pass recognition system and rescoring output on two out-of-domain tasks (ATIS and WSJ). By combining prompting techniques with fine-tuning we achieve error rates below the N-best oracle level, showcasing the generalization power of the LLMs. Comment: Accepted to IEEE Automatic Speech Recognition and Understanding (ASRU) 2023. 8 pages. 2nd version revised from Sep 29th's version

    Vapor-Phase Stoichiometry and Heat Treatment of CdTe Starting Material for Physical Vapor Transport

    Six batches of CdTe, having total amounts of material from 99 to 203 g and gross mole fraction of Te, X(sub Te), 0.499954-0.500138, were synthesized from pure Cd and Te elements. The vapor-phase stoichiometry of the as-synthesized CdTe batches was determined from the partial pressure of Te2, P(sub Te2), using an optical absorption technique. The measured vapor compositions at 870 C were Te-rich for all of the batches with partial pressure ratios of Cd to Te2, P(sub Cd)/P(sub Te2), ranging from 0.00742 to 1.92. After the heat treatment of baking under dynamic vacuum at 870 C for 8 min, the vapor-phase compositions moved toward that of the congruent sublimation, i.e. P(sub Cd)/P(sub Te2) = 2.0, with the measured P(sub Cd)/P(sub Te2) varying from 1.84 to 3.47. The partial pressure measurements on one of the heat-treated samples also showed that the sample remained close to the congruent sublimation condition over the temperature range 800-880 C.
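
The congruent-sublimation ratio of 2.0 quoted in this abstract follows from a short stoichiometric argument, sketched here under the assumption that the vapor consists only of Cd atoms and Te2 molecules behaving as ideal gases at a common temperature and volume:

```latex
% Sublimation reaction (vapor assumed to contain only Cd and Te_2):
\mathrm{CdTe(s)} \;\rightleftharpoons\; \mathrm{Cd(g)} + \tfrac{1}{2}\,\mathrm{Te_2(g)}

% Congruent sublimation: the vapor has the same 1:1 Cd:Te atomic ratio as the solid.
% Each Te_2 molecule carries two Te atoms, so
n_{\mathrm{Cd}} = 2\, n_{\mathrm{Te_2}}

% For ideal gases at the same T and V, partial pressures scale with mole numbers:
\frac{P_{\mathrm{Cd}}}{P_{\mathrm{Te_2}}} = \frac{n_{\mathrm{Cd}}}{n_{\mathrm{Te_2}}} = 2
```

On this reading, a measured ratio below 2 indicates a Te-rich vapor, which is consistent with the as-synthesized range of 0.00742 to 1.92 being described as Te-rich.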